Smart Paper Notes

Can You Fool AI by Doing a 180? – A Case Study on Authorship Analysis of Texts by Arata Osada

Jagna Nieuwazny , Karol Nowakowski , Michal Ptaszynski , Fumito Masui

Categories: Natural Language Processing | Artificial Intelligence | Machine Learning

2022-07-19

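This paper is our attempt to answer a twofold question covering the areas of ethics and authorship analysis. First, since the methods used for authorship analysis imply that an author can be recognized by the content he or she creates, we wanted to find out whether an author identification system can correctly attribute works to an author who has undergone a major psychological transition over the years. Second, from the point of view of the evolution of an author's ethical values, we examined what it would mean if an authorship attribution system has difficulty detecting single authorship. We set out to answer these questions by performing a binary authorship analysis task with a text classifier based on a pre-trained transformer model and with a baseline method relying on conventional similarity metrics. For the test set, we chose works by Arata Osada, a Japanese educator and specialist in the history of education. Half of them are books written before World War II and the other half were written in the 1950s; between these two periods his political opinions underwent a transformation. As a result, we were able to confirm that for texts written by Arata Osada more than 10 years apart, classification accuracy drops by a large margin and is substantially lower than for texts by other non-fiction writers, while the confidence scores of the predictions remain at a level similar to that for shorter time spans. This indicates that the classifier was in many instances fooled into deciding that texts written many years apart were actually written by two different people. This in turn leads us to believe that such a change can affect authorship analysis, and that historical events have a great impact on the ethical outlook a person expresses in their writings.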

Segmentation based tracking of cells in 2D+time microscopy images of macrophages

Seol Ah Park , Tamara Sipka , Zuzana Kriva , George Lutfalla , Mai Nguyen-Chi , Karol Mikula

Categories: Computer Vision

2023-01-02

The automated segmentation and tracking of macrophages during their migration are challenging tasks due to their dynamically changing shapes and motions. This paper proposes a new algorithm to achieve automatic cell tracking in time-lapse microscopy macrophage data. First, we design a segmentation method employing space-time filtering, local Otsu's thresholding, and the SUBSURF (subjective surface segmentation) method. Next, the partial trajectories for cells overlapping in the temporal direction are extracted in the segmented images. Finally, the extracted trajectories are linked by considering their direction of movement. The segmented images and the obtained trajectories from the proposed method are compared with those of the semi-automatic segmentation and manual tracking. The proposed tracking achieved 97.4% accuracy for macrophage data under challenging conditions such as feeble fluorescent intensity, irregular shapes, and motion of macrophages. We expect that the automatically extracted trajectories of macrophages can provide evidence of how macrophages migrate depending on their polarization modes in situations such as wound healing.
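
To make the "local Otsu's thresholding" step in the abstract above more concrete, here is a minimal pure-Python sketch of the general technique. It is not the authors' implementation: the function names, the fixed non-overlapping tiling, and the handling of uniform tiles are all assumptions made for illustration, and the space-time filtering and SUBSURF steps are not shown.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t that maximizes between-class variance (Otsu's method).

    `pixels` is an iterable of integer intensities in [0, levels).
    Pixels with value <= t are treated as background.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))

    # Assumption: if no split exists (uniform input), return the top level
    # so that nothing is marked as foreground.
    best_t, best_var = levels - 1, -1.0
    w0 = 0    # pixel count of the background class
    sum0 = 0  # intensity sum of the background class
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        between = w0 * w1 * (m0 - m1) ** 2  # proportional to between-class variance
        if between > best_var:
            best_var, best_t = between, t
    return best_t


def local_otsu(image, tile=2):
    """Binarize a 2D list-of-lists image by running Otsu separately on each tile."""
    h, w = len(image), len(image[0])
    mask = [[0] * w for _ in range(h)]
    for r0 in range(0, h, tile):
        for c0 in range(0, w, tile):
            block = [(r, c)
                     for r in range(r0, min(r0 + tile, h))
                     for c in range(c0, min(c0 + tile, w))]
            t = otsu_threshold(image[r][c] for r, c in block)
            for r, c in block:
                mask[r][c] = 1 if image[r][c] > t else 0
    return mask
```

Computing the threshold per tile rather than once per image lets dim cells in one region be separated from their local background even when another region is much brighter, which is the usual motivation for a local variant when fluorescence intensity is weak or uneven.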

RT-1: Robotics Transformer for Real-World Control at Scale

Anthony Brohan , Noah Brown , Justice Carbajal , Yevgen Chebotar , Joseph Dabis , Chelsea Finn , Keerthana Gopalakrishnan , Karol Hausman , Alex Herzog , Jasmine Hsu

Categories: Robotics | Artificial Intelligence | Natural Language Processing | Computer Vision | Machine Learning

2022-12-13

By transferring knowledge from large, diverse, task-agnostic datasets, modern machine learning models can solve specific downstream tasks either zero-shot or with small task-specific datasets to a high level of performance. While this capability has been demonstrated in other fields such as computer vision, natural language processing or speech recognition, it remains to be shown in robotics, where the generalization capabilities of the models are particularly critical due to the difficulty of collecting real-world robotic data. We argue that one of the keys to the success of such general robotic models lies with open-ended task-agnostic training, combined with high-capacity architectures that can absorb all of the diverse, robotic data. In this paper, we present a model class, dubbed Robotics Transformer, that exhibits promising scalable model properties. We verify our conclusions in a study of different model classes and their ability to generalize as a function of the data size, model size, and data diversity based on a large-scale data collection on real robots performing real-world tasks. The project's website and videos can be found at robotics-transformer.github.io

Improving Group Lasso for high-dimensional categorical data

Szymon Nowakowski , Piotr Pokarowski , Wojciech Rejchel , Agnieszka Sołtys

Categories: Machine Learning (Statistics)

2022-10-25

Sparse modelling or model selection with categorical data is challenging even for a moderate number of variables, because roughly one parameter is needed to encode each category or level. The Group Lasso is a well-known, efficient algorithm for selecting continuous or categorical variables, but all estimates related to a selected factor usually differ. Therefore, a fitted model may not be sparse, which makes the model difficult to interpret. To obtain a sparse solution of the Group Lasso we propose the following two-step procedure: first, we reduce data dimensionality using the Group Lasso; then, to choose the final model, we use an information criterion on a small family of models prepared by clustering levels of individual factors. We investigate selection correctness of the algorithm in a sparse high-dimensional scenario. We also test our method on synthetic as well as real datasets and show that it performs better than state-of-the-art algorithms with respect to prediction accuracy or model dimension.
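
For orientation, the Group Lasso estimator in its standard textbook form for a linear model is written out below. The notation is an assumption made here and may differ from the paper's: $X_g$ is the block of dummy columns for factor $g$, $\beta_g$ its coefficient vector of length $p_g$, and $\lambda \ge 0$ the tuning parameter.

$$
\hat{\beta} = \arg\min_{\beta} \; \frac{1}{2n} \Big\| y - \sum_{g=1}^{G} X_g \beta_g \Big\|_2^2 + \lambda \sum_{g=1}^{G} \sqrt{p_g} \, \| \beta_g \|_2
$$

Because the norm $\|\beta_g\|_2$ is not squared, whole blocks $\beta_g$ can be set exactly to zero, which removes a factor from the model. Within a block that is kept, however, nothing pushes individual coefficients towards each other, so the levels of a selected factor typically all receive distinct estimates. This is the lack of sparsity that the abstract's second step, clustering the levels of each factor, is meant to address.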

Approaching English-Polish Machine Translation Quality Assessment with Neural-based Methods

Artur Nowakowski

Categories: Natural Language Processing

2022-09-22

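This paper presents our contribution to PolEval 2021 Task 2: Evaluation of translation quality assessment metrics. We describe experiments with pre-trained language models and state-of-the-art frameworks for translation quality assessment in both the nonblind and blind versions of the task. Our solutions ranked second in the nonblind version and third in the blind version.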

Code as Policies: Language Model Programs for Embodied Control

Jacky Liang , Wenlong Huang , Fei Xia , Peng Xu , Karol Hausman , Brian Ichter , Pete Florence , Andy Zeng

Categories: Robotics

2022-09-16

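Large language models (LLMs) trained on code completion have been shown to be capable of synthesizing simple Python programs from docstrings [1]. We find that these code-writing LLMs can be repurposed to write robot policy code, given natural language commands. Specifically, policy code can express functions or feedback loops that process perception outputs (e.g., from object detectors [2], [3]) and parameterize control primitive APIs. When provided as input with several example language commands (formatted as comments) followed by the corresponding policy code (via few-shot prompting), LLMs can take in new commands and autonomously recompose API calls to generate new policy code. By chaining classic logic structures and referencing third-party libraries (e.g., NumPy, Shapely) to perform arithmetic, LLMs used in this way can write robot policies that (i) exhibit spatial-geometric reasoning, (ii) generalize to new instructions, and (iii) prescribe precise values (e.g., velocities) for ambiguous descriptions (e.g., "faster") depending on context (i.e., behavioral commonsense). This paper presents code as policies: a robot-centric formalization of language model generated programs (LMPs) that can represent reactive policies (e.g., impedance controllers) as well as waypoint-based policies (vision-based pick and place, trajectory-based control), demonstrated across multiple real robot platforms. Central to our approach is prompting hierarchical code generation (recursively defining undefined functions), which can write more complex code and also improves on the state of the art, solving 39.8% of the problems in the HumanEval [1] benchmark. Code and videos are available at https://code-as-policies.github.io.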

Adam Mickiewicz University at WMT 2022: NER-Assisted and Quality-Aware Neural Machine Translation

Artur Nowakowski , Gabriela Pałka , Kamil Guttmann , Mikołaj Pokrywka

Categories: Natural Language Processing

2022-09-07

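This paper presents Adam Mickiewicz University's (AMU) submissions to the constrained track of the WMT 2022 General MT Task. We participated in the Ukrainian $\leftrightarrow$ Czech translation directions. The systems are a weighted ensemble of four models based on the Transformer (big) architecture. The models use source factors to exploit information about named entities present in the input. Each model in the ensemble was trained using only the data provided by the shared task organizers. A noisy back-translation technique was used to augment the training corpora. One of the models in the ensemble is a document-level model, trained on parallel and synthetic longer sequences. During sentence-level decoding, the ensemble generated an n-best list. This n-best list was merged with the n-best list generated by a single document-level model that translated multiple sentences at a time. Finally, existing quality estimation models and minimum Bayes risk decoding were used to rerank the n-best list so that the best hypothesis was chosen according to the COMET evaluation metric. According to the automatic evaluation results, our systems rank first in both translation directions.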
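
The reranking step in the abstract above relies on minimum Bayes risk (MBR) decoding. The sketch below illustrates the general idea only and is not the submitted system: the function names are hypothetical, and a toy token-overlap F1 stands in for the COMET metric and quality estimation models, which are learned models and are not used here.

```python
from collections import Counter


def overlap_f1(hyp, ref):
    """Toy utility: token-level F1 between two strings (stand-in for a learned metric)."""
    h, r = Counter(hyp.split()), Counter(ref.split())
    common = sum((h & r).values())
    if common == 0:
        return 0.0
    precision = common / sum(h.values())
    recall = common / sum(r.values())
    return 2 * precision * recall / (precision + recall)


def mbr_select(candidates, utility=overlap_f1):
    """Return the candidate with the highest average utility against the other candidates.

    Each other candidate acts as a pseudo-reference, so the winner is the
    hypothesis that agrees most with the rest of the n-best list.
    """
    if len(candidates) == 1:
        return candidates[0]

    def expected_utility(i):
        others = [c for j, c in enumerate(candidates) if j != i]
        return sum(utility(candidates[i], o) for o in others) / len(others)

    best = max(range(len(candidates)), key=expected_utility)
    return candidates[best]
```

Merging two n-best lists before reranking, as the abstract describes for the sentence-level ensemble and the document-level model, would in this sketch amount to concatenating the two candidate lists and passing the result to `mbr_select`.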

Co-Imitation: Learning Design and Behaviour by Imitation

Chang Rajani , Karol Arndt , David Blanco-Mulero , Kevin Sebastian Luck , Ville Kyrki

Categories: Machine Learning | Artificial Intelligence | Robotics

2022-09-02

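The co-adaptation of robots has been a long-standing research endeavour with the goal of adapting both the body and the behaviour of a system to a given task, inspired by the natural evolution of animals. Co-adaptation has the potential to eliminate costly manual hardware engineering and to improve system performance. The standard approach to co-adaptation is to use a reward function to optimize behaviour and morphology. However, defining and constructing such reward functions is notoriously difficult and often a significant engineering effort. This paper introduces a new viewpoint on the co-adaptation problem, which we call co-imitation: finding a morphology and a policy that allow an imitator to closely match the behaviour of a demonstrator. To this end, we propose a co-imitation methodology for adapting behaviour and morphology by matching the state distributions of the demonstrator. Specifically, we focus on the challenging scenario in which the state and action spaces of the two agents are mismatched. We find that co-imitation increases behaviour similarity across a variety of tasks and settings, and we demonstrate co-imitation by transferring human walking, jogging, and kicking onto a simulated humanoid.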

Entity Graph Extraction from Legal Acts -- a Prototype for a Use Case in Policy Design Analysis

Anna Wróblewska , Bartosz Pieliński , Karolina Seweryn , Karol Saputa , Aleksandra Wichrowska , Sylwia Sysko-Romańczuk , Hanna Schreiber

Categories: Natural Language Processing

2022-09-02

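This paper presents research on a prototype developed to serve the quantitative study of public policy design. This sub-discipline of political science focuses on identifying actors, the relations between them, and the tools at their disposal in health, environmental, economic, and other policies. Our system aims to automate the process of gathering legal documents, annotating them with Institutional Grammar, and using hypergraphs to analyse the interrelations between key entities. Our system is tested on the 2003 UNESCO Convention for the Safeguarding of the Intangible Cultural Heritage, a legal document that regulates essential aspects of international relations for securing cultural heritage.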

Distance-based detection of out-of-distribution silent failures for Covid-19 lung lesion segmentation

Camila Gonzalez , Karol Gotkowski , Moritz Fuchs , Andreas Bucher , Armin Dadras , Ricarda Fischbach , Isabel Kaltenborn , Anirban Mukhopadhyay

Categories: Computer Vision | Machine Learning

2022-08-05

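Automatic segmentation of ground glass opacities and consolidations in chest computed tomography (CT) scans can ease the burden on radiologists during times of high resource utilisation. However, deep learning models are not trusted in clinical routine because they fail silently on out-of-distribution (OOD) data. We propose a lightweight OOD detection method that leverages the Mahalanobis distance in the feature space and integrates seamlessly into state-of-the-art segmentation pipelines. This simple approach can even augment pre-trained models with clinically relevant uncertainty quantification. We validate our method on four chest CT distribution shifts and two magnetic resonance imaging applications, namely segmentation of the hippocampus and the prostate. Our results show that the proposed method effectively detects far- and near-OOD samples in all explored scenarios.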
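
As a rough illustration of the distance computation named in the abstract above, the following pure-Python sketch fits a mean and covariance to in-distribution feature vectors and scores a new vector by its Mahalanobis distance. It is a sketch under stated assumptions, not the paper's pipeline: the helper names are hypothetical, the two-dimensional toy features stand in for features taken from a segmentation network, and how the resulting distance is thresholded is not specified here.

```python
import math


def mean_and_cov(features):
    """Estimate the mean vector and sample covariance matrix of a list of feature vectors."""
    n, d = len(features), len(features[0])
    mu = [sum(x[j] for x in features) / n for j in range(d)]
    cov = [[sum((x[i] - mu[i]) * (x[j] - mu[j]) for x in features) / (n - 1)
            for j in range(d)]
           for i in range(d)]
    return mu, cov


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting.

    Assumes `a` is non-singular; a singular covariance would need regularization.
    """
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def mahalanobis(x, mu, cov):
    """Distance sqrt((x - mu)^T cov^{-1} (x - mu)), computed without forming the inverse."""
    diff = [xi - mi for xi, mi in zip(x, mu)]
    z = solve(cov, diff)
    return math.sqrt(sum(d * zi for d, zi in zip(diff, z)))
```

A sample whose distance is much larger than the distances seen on the training features would be flagged as a likely OOD input, so that a silent segmentation failure can be surfaced instead of trusted.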
